A comparison of broad phonetic and acoustic units for noise robust segment-based phonetic recognition
Abstract
In this paper, we compare speech recognition performance using broad phonetically- and acoustically-motivated units as a pre-processor in designing a novel noise robust landmark detection and segmentation algorithm. We introduce a cluster evaluation method to measure acoustic unit cluster quality. On the noisy TIMIT task, we find that the acoustic and phonetic segmentation approaches offer significant improvements over two baseline methods used in the SUMMIT segment-based speech recognizer, a sinusoidal model method and a spectral change approach. In addition, we find that the acoustic method has much faster computation time in stationary noises, while the phonetic approach is faster in non-stationary noise conditions.
Similar resources
Using Chi-Square Testing in Modeling Confusion Characteristics for Robust Phonetic Set Generation
A phonetic representation of a language is used to describe the corresponding pronunciation and synthesize the acoustic model of any vocabulary. In order to obtain better phonetic representation, context-dependent units are used to model co-articulation effects between phones and have been broadly used in speech recognition. However, this representation generally increases the number of recognition ...
Applications of broad class knowledge for noise robust speech recognition
This thesis introduces a novel technique for noise robust speech recognition by first describing a speech signal through a set of broad speech units, and then conducting a more detailed analysis from these broad classes. These classes are formed by grouping together parts of the acoustic signal that have similar temporal and spectral characteristics, and therefore have much less variability tha...
Generation of robust phonetic set and decision tree for Mandarin using chi-square testing
A phonetic representation of a language is used to describe the corresponding pronunciation and synthesize the acoustic model of any vocabulary. A phonetic representation with smaller phonetic units such as SAMPA-C for Mandarin Chinese and decision trees for parameter sharing are broadly applied to deal with the problem of large numbers of recognition units. However, the confusable phonetic rep...
Significance of group delay based acoustic features in the linguistic search space for robust speech recognition
In this paper we discuss the complementarity of the group delay features with respect to other conventional acoustic features and also propose the use of such diverse information in the linguistic search space for robust speech recognition. A discriminability analysis is carried out on various classes of phonetic units. A class based phonetic unit analysis is conducted to compare the suitabilit...
Detection of Acoustic-Phonetic Landmarks in Mismatched Conditions using a Biomimetic Model of Human Auditory Processing
Acoustic-phonetic landmarks provide robust cues for speech recognition and are relatively invariant between speakers, speaking styles, noise conditions and sampling rates. The ability to detect acoustic-phonetic landmarks as a front-end for speech recognition has been shown to improve recognition accuracy. Biomimetic inter-spike intervals and average signal level have been shown to accurately c...